Spoiler Alert: Machine Learning Approaches to Detect Social Media Posts with Revelatory Information

Authors

  • Jackie Sauter Zajac
  • Jordan Boyd-Graber
Abstract

Spoilers—critical plot information about works of fiction that “spoil” a viewer’s enjoyment—have prompted elaborate conventions on social media to allow readers to insulate themselves from spoilers. However, these solutions depend on the conscientiousness and rigor of Internet posters and are thus an imperfect system. We create an automatic alternative that could alert users when a piece of text contains a spoiler. An automated spoiler detector serves not only as an additional protection against spoilers, but it also contributes to important problems in computational linguistics. We develop a new dataset of spoilers gathered from social media and create automatic classifiers using machine learning techniques. After establishing baseline performance using lexical features, we develop metadata-based features that substantially improve performance on the spoiler detection task.
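As a rough illustration of the lexical baseline mentioned in the abstract, the sketch below trains a bag-of-words Naive Bayes classifier on a handful of invented posts. The abstract does not say which classifier or which lexical features the authors used, so the model choice, the function names (`tokenize`, `train_naive_bayes`, `predict`), and the toy data in `TOY_POSTS` are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict

# Invented toy posts (NOT from the paper's dataset), labeled by hand.
TOY_POSTS = [
    ("the main character dies at the end", "spoiler"),
    ("turns out the butler was the killer all along", "spoiler"),
    ("he dies in the finale and the killer escapes", "spoiler"),
    ("the new season starts next week", "non-spoiler"),
    ("i love the soundtrack of this show", "non-spoiler"),
    ("great acting and a lovely soundtrack this season", "non-spoiler"),
]


def tokenize(text):
    """Lowercase a post and split it into word tokens (unigram lexical features)."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Collect per-label word counts, label counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under the model."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Log prior from the share of training posts with this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_POSTS)
print(predict(model, "the killer dies"))
print(predict(model, "lovely soundtrack next week"))
```

The metadata-based features the abstract describes are not shown here, since the abstract does not specify them; in a sketch like this they would be added as extra features alongside the word counts.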

Similar resources

Monitoring Electoral Violence through Social Media: A Machine Learning Approach

In recent years, there has been a growing research interest in social media platforms such as Facebook and Twitter, which have been shown to be influential in reporting news and events around the world [1]. Twitter is a communication platform allowing users to report and comment on a wide range of topics, including political events. Indeed, both citizens and politicians are increasingly embraci...

Detecting and Analyzing Influenza Epidemics with Social Media in China

In recent years, social media has become important and omnipresent for social networking and information sharing. Researchers and scientists have begun to mine social media data to predict a variety of social, economic, health and entertainment related real-world phenomena. In this paper, we exhibit how social media data can be used to detect and analyze real-world phenomena with several data mini...

Similarity measurement for describe user images in social media

Online social networks like Instagram are places for communication. These media also produce rich metadata that are useful for further analysis in many fields, including health and cognitive science. Many researchers use metadata such as hashtags and images to detect patterns of user activity. However, there are several serious ambiguities, such as how reliable these informa...

Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyberbullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text of tweets via one of the machine l...

Detecting Early Risk of Depression from Social Media User-generated Content

This paper presents the systems developed by the UQAM team for the CLEF eRisk Pilot Task 2017. The goal was to predict as early as possible the risk of mental health issues from user-generated content in social media. Several approaches based on supervised learning and information retrieval methods were used to estimate the risk of depression for a user given the content of their posts on Reddit....

Publication date: 2013